Reducing the Synchronization Overhead in Parallel Nonsymmetric Krylov Algorithms on MIMD Machines

Authors

  • Muthucumaru Maheswaran
  • Kevin J. Webb
  • Howard Jay Siegel
Abstract

By considering electromagnetic scattering problems as examples, a study of the performance and scalability of the conjugate gradient squared (CGS) algorithm on two MIMD machines is presented. A modified CGS (MCGS) algorithm, where the synchronization overhead is effectively reduced by a factor of two, is proposed in this paper. This is achieved by changing the computation sequence in the CGS algorithm. Both experimental and theoretical analyses were performed to investigate the impact of this modification on the overall execution time.
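For orientation, the sketch below shows the standard, unpreconditioned CGS iteration in plain Python. It is not the authors' MCGS: the abstract does not spell out the reordering, so none is attempted here. The function names (`matvec`, `dot`, `cgs`) and the dense list-of-lists matrix are assumptions made for illustration only. The comments mark the inner products, which on a distributed-memory MIMD machine would each require a global reduction and so act as synchronization points.

```python
from math import sqrt


def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(a, b):
    """Inner product. In a parallel code this is a global reduction."""
    return sum(a_i.conjugate() * b_i for a_i, b_i in zip(a, b))


def cgs(A, b, tol=1e-10, max_iter=100):
    """Textbook CGS for A x = b, starting from x = 0 (illustrative sketch)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)            # residual b - A x for the zero initial guess
    r_shadow = list(r)     # fixed shadow residual
    p = [0.0] * n
    q = [0.0] * n
    rho_prev = 1.0
    for k in range(max_iter):
        rho = dot(r_shadow, r)              # synchronization point 1
        if rho == 0:
            break                           # breakdown or exact solution
        if k == 0:
            u = list(r)
            p = list(u)
        else:
            beta = rho / rho_prev
            u = [r_i + beta * q_i for r_i, q_i in zip(r, q)]
            p = [u_i + beta * (q_i + beta * p_i)
                 for u_i, q_i, p_i in zip(u, q, p)]
        v = matvec(A, p)
        sigma = dot(r_shadow, v)            # synchronization point 2
        if sigma == 0:
            break                           # breakdown
        alpha = rho / sigma
        q = [u_i - alpha * v_i for u_i, v_i in zip(u, v)]
        uq = [u_i + q_i for u_i, q_i in zip(u, q)]
        x = [x_i + alpha * w_i for x_i, w_i in zip(x, uq)]
        A_uq = matvec(A, uq)
        r = [r_i - alpha * w_i for r_i, w_i in zip(r, A_uq)]
        rho_prev = rho
        # The convergence check is one more reduction in a parallel setting.
        if sqrt(abs(dot(r, r))) < tol:
            break
    return x
```

In this form the two inner products are separated by a matrix-vector product, so they cannot share a single reduction without rearranging the computation.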


Similar Resources

MCGS: A Modified Conjugate Gradient Squared Algorithm for Nonsymmetric Linear Systems

The conjugate gradient squared (CGS) algorithm is a Krylov subspace algorithm that can be used to obtain fast solutions for linear systems (Ax = b) with complex nonsymmetric, very large, and very sparse coefficient matrices (A). By considering electromagnetic scattering problems as examples, a study of the performance and scalability of this algorithm on two MIMD machines is presented. A modified ...


MCGS: A Modified Conjugate Gradient Squared Algorithm for Nonsymmetric Linear Systems

The conjugate gradient squared (CGS) algorithm is a Krylov subspace algorithm that can be used to obtain fast solutions for linear systems (Ax = b) with complex nonsymmetric, very large, and very sparse coefficient matrices (A). By considering electromagnetic scattering problems as examples, a study of the performance and scalability of this algorithm on two MIMD machines is presented. A modified CGS (MCGS) algo...


Optimizing the Emulation of MIMD Behavior on SIMD Machines

SIMD computers have proved to be a useful and cost-effective approach to massively parallel computation. On the other hand, there are algorithms which are very inefficient when directly translated into a data-parallel program. This paper presents a number of simple transformations which are able to reduce this SIMD overhead to a moderate constant factor. In particular, this factor is often much s...


Efficient Emulation of MIMD Behavior on SIMD Machines

SIMD computers have proved to be a useful and cost-effective approach to massively parallel computation. On the other hand, there are algorithms which are very inefficient when directly translated into a data-parallel program. This paper presents a number of simple transformations which are able to reduce this SIMD overhead to a moderate constant factor. It also introduces techniques for reducing ...


A Parallel Algorithm for Connected Components on Distributed Memory Machines

Finding connected components (CC) of an undirected graph is a fundamental computational problem. Various CC algorithms exist for PRAM models. An implementation of a PRAM CC algorithm on a coarse-grain MIMD machine with distributed memory brings many problems, since the communication overhead is substantial compared to the local computation. Several implementations of CC algorithms on distribute...




Publication date: 1998